{ "cells": [ { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "## 15.2 Data used in our examples\n", "\n", "We will use a simulated dataset that represents electronic health records for 200,000 patients. The outcome of interest is whether a patient is diagnosed with dementia. There is an additional complexity in this example: patients were followed up for different lengths of time, and longer follow-up naturally leads to a higher probability of a dementia diagnosis. In later modules, we will encounter survival analysis, which allows follow-up time to be accounted for. For now, we will ignore this aspect.\n", "\n", "The code below reads in the dataset and displays the first few rows." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [ { "data": { "text/html": [ "
id | prac | pr_lcd | sex | age | bmi | bmi_category | consultations | agegp | alcohol | ... | mortality | date_death | timetodementia | dementia | date_dementia | end_date | dob | rsample | vitd | lp |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
23189 | 142 | 08dec2009 | 1 | 53 | 20.4 | Normal (18.5-<25) | 12 | 50 | 3-6 units/day | ... | 0 | | NA | 0 | | 08dec2009 | 01nov1941 | 1 | NA | -0.8153054 |
92186 | 132 | 03feb2003 | 0 | 73 | 21.5 | Normal (18.5-<25) | 4 | 70 | <2 units/day | ... | 0 | | NA | 0 | | 03feb2003 | 16jan1928 | 1 | NA | -1.2268275 |
187963 | 43 | 06jul2001 | 0 | 40 | 27.1 | Overweight (25-<30) | 0 | 40 | <2 units/day | ... | 0 | | NA | 0 | | 06jul2001 | 18jun1961 | 1 | NA | -0.6602434 |
148379 | 215 | 08mar2012 | 1 | 40 | 20.9 | Normal (18.5-<25) | 3 | 40 | <2 units/day | ... | 0 | | NA | 0 | | 08mar2012 | 10feb1952 | 1 | 23.22692 | -0.9507329 |
44194 | 225 | 02feb2011 | 1 | 92 | 32.5 | Obese class I (30-<35) | 10 | 90 | Non drinker | ... | 0 | | NA | 0 | | 02feb2011 | 09dec1912 | 1 | NA | 1.0403746 |
169915 | 175 | 02nov2011 | 1 | 55 | 26.3 | Overweight (25-<30) | 3 | 55 | 3-6 units/day | ... | 0 | | NA | 0 | | 02nov2011 | 06oct1946 | 1 | NA | -0.1080445 |